Here, I present an elementary proof of Zorn's Lemma under the Axiom of Choice, simplifying and
supplying necessary details in the original proof by Paul R. Halmos in Naive Set Theory [1].



To start with, I assume knowledge of basic Set Theory, i.e., the axiom of extension, axiom of
specification, axiom of pairing, axiom of unions, axiom of powers, ordered pairs, relations,
functions, families, axiom of infinity, numbers, Peano arithmetic, order, and the axiom of choice.

As the Axiom of Choice is central to the proof, here is the description, as given in Halmos's Naive
Set Theory:

Axiom of Choice:

The Cartesian product of a non-empty family of non-empty sets is non-empty.

Suppose that $\mathcal{C}$ is a non-empty collection of non-empty sets. We can convert $\mathcal{C}$
into an indexed set, by using the collection $\mathcal{C}$ itself in the role of the index set and
using the identity mapping on $\mathcal{C}$ in the role of the indexing. The axiom of choice then
says that the Cartesian product of the sets of $\mathcal{C}$ has at least one element. An element of
such a Cartesian product is, by definition, a function whose domain is the index set (here
$\mathcal{C}$) and whose value at each index belongs to the set bearing that index. Therefore, an
equivalent form of the Axiom of Choice is that there exists a function $f$ with domain $\mathcal{C}$
such that if $A \in \mathcal{C}$, then $f(A) \in A$. This conclusion applies, in particular, in case
$\mathcal{C}$ is the collection of all non-empty subsets of a non-empty set $X$. The assertion in
that case is that there exists a function $f$ with domain $P(X) \setminus \{\emptyset\}$ such that
if $A$ is in that domain, then $f(A) \in A$. In intuitive language, the function $f$ can be
described as a simultaneous choice of an element from each of many sets; this is the reason for the
name of the axiom.
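
For a finite set $X$ no axiom is needed to obtain such a function, but a small sketch may make the
shape of $f$ concrete. The Python below is only an illustration: the names `nonempty_subsets` and
`choice_function`, the sample set, and the rule of always picking the smallest element are
assumptions of this sketch and are not taken from Halmos's text.

```python
from itertools import combinations


def nonempty_subsets(X):
    """Yield every non-empty subset of the finite set X as a frozenset."""
    items = sorted(X)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def choice_function(X):
    """Return a dict f whose domain is P(X) minus the empty set, with f[A] in A.

    Picking min(A) is an arbitrary, illustrative rule (an assumption of this
    sketch); any rule that returns a member of A would serve equally well.
    """
    return {A: min(A) for A in nonempty_subsets(X)}


f = choice_function({1, 2, 3})
# Every non-empty subset A is sent to one of its own elements.
all_values_belong = all(f[A] in A for A in f)
```

The point of the axiom is that such an $f$ is asserted to exist even when no explicit rule like
`min` is available.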

1 A Preamble to Zorn's Lemma 

The Statement of Zorn's Lemma is as follows: 

If X is a partially ordered set such that every chain in X has an upper bound, then 
X contains a maximal element. 

Although Zorn's name has stuck to this Lemma, there were similar maximal principles (esp. by
Hausdorff and Kuratowski) before Zorn published his results in 1935. Although there are many
variants of Zorn's Lemma, they are all equivalent to each other and to the Axiom of Choice.

[1] Paul R. Halmos. Naive Set Theory. Van Nostrand Reinhold Company, 1960. Reprinted by Martino
Fine Books in 2011.

So, how did the term "Zorn's Lemma" come to be? Mycielski attributed to Semadeni [2] the following
convincing explanation:

Namely, in Science, the consumer decides upon the name of the tools which he uses, and the 
consumer is not always the best informed person. 

Nonetheless, there is no doubt that Zorn provided a great service in directing attention to the 
largely unrealized potential in maximal principles. 

As an example, Zorn's Lemma can be used to prove that every non-zero vector space has a basis. We
consider the set of all linearly independent subsets of the given vector space $V$, partially
ordered by inclusion. Let $Y$ be a chain of linearly independent subsets of $V$. We note that the
union of such a chain can serve as an upper bound for it. To apply Zorn's Lemma, we have to check
that the union is linearly independent. If $t_1, \ldots, t_n$ belong to the union, then each $t_i$
belongs to some linearly independent set $L_i \in Y$. Because $Y$ is a chain, one of these sets
$L_i$ contains all the others. If that is $L_j$, then the linear independence of $L_j$ implies that
no non-trivial linear combination of $t_1, \ldots, t_n$ can be zero, which proves that the union of
the sets in $Y$ is linearly independent. Therefore, by Zorn's Lemma, there is a maximal linearly
independent set. Such a set is not just linearly independent but also spans the whole space, since
if it did not, we could pick an element outside its linear span and add it to the set, obtaining a
larger linearly independent set and contradicting maximality.

2 Proof of Zorn's Lemma: 

Proof. 

First of all, the empty chain $\emptyset$ is a chain in $X$ and, by the hypothesis of Zorn's Lemma,
must have an upper bound, say $z$, in $X$. As a result, $X \neq \emptyset$, and so we can permit
each element of $X$ to be an upper bound of the empty chain $\emptyset$.

In all that follows, $X \neq \emptyset$.

Let $s$ be the function from $X$ to $P(X)$ given by $s(x) = \{y \in X : y \leq x\}$, and let
$S = \mathrm{ran}(s)$ be partially ordered by inclusion.

Then $s$ is a one-to-one function: if not, $\exists x, y \in X$ with $x \neq y$ and $s(x) = s(y)$.
But as $s(x) = s(y)$, we have $x \in s(y)$ and $y \in s(x)$, i.e., $x \leq y$ and $y \leq x$.
Therefore, $x = y$, which is a contradiction.

If $x, y \in X$ with $x \leq y$, then $s(x) \subseteq s(y)$, as all elements less than or equal to
$x$ are less than or equal to $y$. On the other hand, if $s(x) \subseteq s(y)$, then $x \leq y$
because $x \in s(x)$.
Thus, we have the following Lemma.

Lemma 1. A necessary and sufficient condition for $s(x) \subseteq s(y)$ is $x \leq y$.

This immediately gives us the next Lemma.

Lemma 2. If $\exists x \in X$ such that $x$ is maximal in $X$, then $s(x)$ is maximal in $S$, and
vice versa.
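
Lemmas 1 and 2 can be checked by brute force on a small example. The sketch below assumes a
hypothetical poset, $X = \{1, 2, 3, 4, 6, 9\}$ ordered by divisibility, chosen only for
illustration; the helper names `leq`, `maximal_in_X` and `maximal_in_S` are likewise assumptions of
the sketch and not part of the proof.

```python
# Hypothetical finite poset: X ordered by divisibility (a <= b means a divides b).
X = [1, 2, 3, 4, 6, 9]


def leq(a, b):
    """The partial order on X: a is below b when a divides b."""
    return b % a == 0


def s(x):
    """The initial segment s(x) = {y in X : y <= x}."""
    return frozenset(y for y in X if leq(y, x))


# S = ran(s), partially ordered by inclusion.
S = {s(x) for x in X}


def maximal_in_X(x):
    """x is maximal when no other element of X lies strictly above it."""
    return all(not (leq(x, y) and x != y) for y in X)


def maximal_in_S(A):
    """A is maximal in S when no member of S strictly contains it."""
    return all(not (A < B) for B in S)


# Lemma 1: s(x) is a subset of s(y) exactly when x <= y.
lemma1 = all((s(x) <= s(y)) == leq(x, y) for x in X for y in X)
# Lemma 2: x is maximal in X exactly when s(x) is maximal in S.
lemma2 = all(maximal_in_X(x) == maximal_in_S(s(x)) for x in X)
maximal_elements = [x for x in X if maximal_in_X(x)]
```

In this example the maximal elements are 4, 6 and 9, and $s$ carries them to the maximal members of
$S$.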

Let $\mathcal{X}$ be the set of all chains in $X$, ordered by inclusion. Then
$\mathcal{X} \neq \emptyset$, as singletons and $\emptyset$ in $P(X)$ are chains in $X$.

Suppose that $C \in \mathcal{X}$. By the hypothesis of Zorn's Lemma, $C$ has an upper bound in $X$,
say $z$. Then, $C \subseteq s(z)$.

[2] P. J. Campbell. The origin of "Zorn's lemma". Historia Math., 5 (1978), pp. 77-89.

Let $\mathcal{C}$ be a chain in $\mathcal{X}$, and let $Y = \bigcup \{C : C \in \mathcal{C}\}$.

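If $\mathcal{C} = \emptyset$, then $Y = \emptyset$ (the empty chain in $X$), which is in fact in
$\mathcal{X}$.

Otherwise, for any $x, y \in Y$, $\exists C_x, C_y \in \mathcal{C}$ with $x \in C_x$ and
$y \in C_y$. Moreover, $\mathcal{C}$ being a chain in $\mathcal{X}$, either $C_x \subseteq C_y$ or
$C_y \subseteq C_x$.

If $C_x \subseteq C_y$, then $x, y \in C_y$ and therefore, $x$ and $y$ are comparable.

If $C_y \subseteq C_x$, then $x, y \in C_x$ and therefore, $x$ and $y$ are comparable.

The conclusion is that $Y \in \mathcal{X}$ for every chain $\mathcal{C}$ in $\mathcal{X}$.

Also, for all $C \in \mathcal{C}$, $C \subseteq Y$. Hence, $Y$ is an upper bound for $\mathcal{C}$
in $\mathcal{X}$.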

Therefore, 

Lemma 3. Every chain in $\mathcal{X}$ has an upper bound in $\mathcal{X}$, the union of the
elements of the chain being one.

We consider the case when $\mathcal{X}$ has a maximal element, say $N$. As $N$ is a chain in $X$, it
has an upper bound in $X$, say $n$. Therefore, $N \cup \{n\}$ is a chain in $X$ with
$N \subseteq N \cup \{n\}$. But, as $N$ is maximal in $\mathcal{X}$, $N \cup \{n\} = N$. So,
$n \in N$ and is therefore the greatest element of $N$.
Hence,

Lemma 4. A maximal chain in $X$ has a unique upper bound in $X$, which turns out to be its greatest
element.

Note that $\emptyset$ is not a maximal chain in $X$, as it is included in every singleton in $P(X)$.

We continue to study the case where $N$ is a maximal element of $\mathcal{X}$.

We claim that $s(n)$ is a maximal element of $S$, where $n$ is the upper bound of $N$.

Let, if possible, $s(n)$ not be a maximal element of $S$. Then, $\exists t \in X$ such that
$s(n) \subsetneq s(t)$, i.e., $n < t$ by Lemma 1. As $N \subseteq s(n)$, we have $N \subseteq s(t)$
and therefore, $t$ is an upper bound of $N$.
But, by Lemma 4, $t = n$, which leads to a contradiction.
Therefore, using Lemma 2, we have the following conclusion.

Lemma 5. If $N$ is a maximal chain in $X$ with the upper bound $n$, then $s(n)$ is a maximal element
of $S$ and $n$ is a maximal element of $X$.

To complete the proof of Zorn's Lemma, it is enough to show that $\mathcal{X}$ has a maximal
element.

It will be more convenient and revealing to consider a general setup of a set $Z \subseteq P(X)$,
partially ordered by inclusion and satisfying:

1) $\emptyset \in Z$,

2) if $A \in Z$ and $B \subseteq A$, then $B \in Z$,

3) if $\mathcal{C}$ is a chain in $Z$, then $\bigcup \{C : C \in \mathcal{C}\} \in Z$.

Note that $\mathcal{X}$ satisfies these properties, and thus our problem now reduces to proving that
there exists a maximal element in $Z$.

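For $A \in Z$, let $\hat{A} = \{x \in X : A \cup \{x\} \in Z\}$. Clearly, $A \subseteq \hat{A}$.
Note that $\hat{\emptyset} = \{x \in X : \{x\} \in Z\}$.

If $Z = \{\emptyset\}$, there exist no singletons in $Z$, and hence, $\hat{\emptyset} = \emptyset$.
Also, if $\hat{\emptyset} = \emptyset$, then there exist no singletons in $Z$, and due to property
2) of $Z$, $Z = \{\emptyset\}$. Therefore, $\hat{\emptyset} = \emptyset$ if and only if
$Z = \{\emptyset\}$.
On the other hand, if $Z \neq \{\emptyset\}$, then $\exists C \in Z$ with $C \neq \emptyset$. As a
result, $\exists z \in C$ and so $\{z\} \subseteq C$. From property 2) of $Z$, $\{z\} \in Z$. So,
$\hat{\emptyset} = \bigcup \{C : C \in Z\} \neq \emptyset$.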

As explained in the description of the Axiom of Choice, there exists a function
$f : P(X) \setminus \{\emptyset\} \to X$ such that $f(A) \in A$ for all $A \in \mathrm{dom}(f)$.
Define a function $g : Z \to Z$ as follows:

a) If $\hat{A} \setminus A \neq \emptyset$, then $g(A) = A \cup \{f(\hat{A} \setminus A)\}$. Here,
$g(A) \in Z$, as $f(\hat{A} \setminus A) \in \hat{A} \setminus A$.

b) If $\hat{A} \setminus A = \emptyset$, then $g(A) = A$.

Also, suppose that $g(A) = A$ but $\hat{A} \setminus A \neq \emptyset$. Then
$f(\hat{A} \setminus A) \in A$, but by definition, $f(\hat{A} \setminus A) \in \hat{A} \setminus A$,
which leads to a contradiction.

Therefore, $\hat{A} \setminus A = \emptyset$ if and only if $g(A) = A$.

Clearly, $A \subseteq g(A)$.

Also, note that $g(\emptyset) = \{f(\hat{\emptyset})\}$ if and only if $Z \neq \{\emptyset\}$, and
$g(\emptyset) = \emptyset$ if and only if $Z = \{\emptyset\}$.

Suppose that $\exists A \in Z$ such that $\hat{A} \setminus A = \emptyset$ and $A$ is not maximal
in $Z$. Then, there exists a $C \in Z$ such that $A \subsetneq C$. Therefore, $\exists x \in C$ such
that $x \notin A$. As $C \in Z$, by property 2) of $Z$, $A \cup \{x\} \in Z$ and so
$x \in \hat{A}$. But, as $x \in \hat{A}$ and $x \notin A$, $\hat{A} \setminus A \neq \emptyset$,
which leads to a contradiction.

Also, if $A$ is maximal in $Z$, there are no elements in $X$ that can be adjoined to $A$ to create a
bigger set present in $Z$, and so $\hat{A} = A$.

Hence, we have the following Lemma.

Lemma 6. $A$ is maximal in $Z$, if and only if $\hat{A} \setminus A = \emptyset$, if and only if
$g(A) = A$.
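
Lemma 6 can be verified exhaustively on a finite example. The sketch below is only an illustration
under stated assumptions: $X = \{1, 2, 3, 4, 6, 9\}$ ordered by divisibility is a hypothetical
poset, $Z$ is taken to be the set of all chains in it, and `min` plays the role of the choice
function $f$. The names `hat`, `g` and `is_maximal` are chosen for this sketch.

```python
from itertools import combinations

# Hypothetical finite poset: X ordered by divisibility.
X = [1, 2, 3, 4, 6, 9]


def comparable(a, b):
    """Two elements are comparable when one divides the other."""
    return a % b == 0 or b % a == 0


def is_chain(A):
    """A subset of X is a chain when its elements are pairwise comparable."""
    return all(comparable(a, b) for a in A for b in A)


# Z: the set of all chains in X (it contains the empty set and is closed
# under taking subsets, as required of Z in the text).
Z = {
    frozenset(c)
    for r in range(len(X) + 1)
    for c in combinations(X, r)
    if is_chain(c)
}


def hat(A):
    """A-hat: the elements x of X whose adjunction to A stays inside Z."""
    return frozenset(x for x in X if (A | {x}) in Z)


def f(A):
    """Stand-in choice function on non-empty subsets (an assumption: min)."""
    return min(A)


def g(A):
    """Adjoin one chosen element of hat(A) minus A, or return A unchanged."""
    extra = hat(A) - A
    return A | {f(extra)} if extra else A


def is_maximal(A):
    """A is maximal in Z when no member of Z strictly contains it."""
    return not any(A < B for B in Z)


# Lemma 6: the three conditions agree for every A in Z.
lemma6 = all(
    is_maximal(A) == (len(hat(A) - A) == 0) and is_maximal(A) == (g(A) == A)
    for A in Z
)
```

Here `g(frozenset())` returns `frozenset({1})`, and the fixed points of `g` are exactly the maximal
chains of this poset.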




Note that, although $\emptyset$ satisfies this requirement if $Z = \{\emptyset\}$, it does not suit
our purpose of finding a maximal element in $X$, as, if $Z = \{\emptyset\}$, then
$\mathcal{X} = \{\emptyset\}$, which means that $X = \emptyset$, which is not true due to the
discussion in the beginning of the proof. Therefore, in further discussion, the search for a maximal
element in $Z$ will not include $\emptyset$.

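Define a subset $\mathcal{T}$ of $Z$ to be a tower if:

1) $\emptyset \in \mathcal{T}$,

2) if $A \in \mathcal{T}$, then $g(A) \in \mathcal{T}$, and

3) if $\mathcal{C}$ is a chain in $\mathcal{T}$, then
$\bigcup \{C : C \in \mathcal{C}\} \in \mathcal{T}$.

Such a $\mathcal{T}$ does exist, as $Z$ itself satisfies these properties.

Note that $\mathcal{T} = \{\emptyset\}$ if and only if $Z = \{\emptyset\}$. The reason is that, if
$Z = \{\emptyset\}$, then the only subset of $Z$ which satisfies the properties of a tower is
$\{\emptyset\}$, and if $\mathcal{T} = \{\emptyset\}$, then by property 2) of $\mathcal{T}$,
$g(\emptyset) \in \mathcal{T}$ and so $g(\emptyset) = \emptyset$, which leads to
$Z = \{\emptyset\}$, as explained before. Therefore, from the discussion above the definition of
towers, $\mathcal{T} = \{\emptyset\}$ is not allowed.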

Also, the intersection of a non-empty family $\{A_i\}$ of towers is a tower, as:

a) $\emptyset \in A_i$ for all $i$, therefore, $\emptyset \in \bigcap_i A_i$,

b) if $x \in \bigcap_i A_i$, then $x \in A_i$ for all $i$, and so by condition 2) in the definition
of towers, $g(x) \in A_i$ for all $i$. Therefore, $g(x) \in \bigcap_i A_i$,

c) if $\mathcal{C}$ is a chain in $\bigcap_i A_i$, then it is a chain in $A_i$ for all $i$ and so
$\bigcup \{C : C \in \mathcal{C}\} \in A_i$ for all $i$, and therefore
$\bigcup \{C : C \in \mathcal{C}\} \in \bigcap_i A_i$.

Therefore,

Lemma 7. Let $\mathcal{T}_0$ be the intersection of all towers; then $\mathcal{T}_0$ is the
smallest tower.

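Let $B \in \mathcal{T}_0$ be called comparable if for all $A \in \mathcal{T}_0$, either
$A \subseteq B$ or $B \subseteq A$. Comparable sets do exist, as $\emptyset \in \mathcal{T}_0$ and
for all $A \in \mathcal{T}_0$, $\emptyset \subseteq A$.

Consider a particular comparable element $\mathfrak{C}$.

Suppose that $A \in \mathcal{T}_0$ and $A \subsetneq \mathfrak{C}$. As $\mathfrak{C}$ is comparable
and $g(A) \in \mathcal{T}_0$ (as $A \in \mathcal{T}_0$), therefore $g(A) \subseteq \mathfrak{C}$ or
$\mathfrak{C} \subsetneq g(A)$.
But $\mathfrak{C} \subsetneq g(A)$ is impossible as, if $\mathfrak{C} \subsetneq g(A)$, then
$A \subsetneq \mathfrak{C} \subsetneq g(A)$, but $g(A)$ has at most one more element than $A$.
Therefore,

if $A \subsetneq \mathfrak{C}$, then $g(A) \subseteq \mathfrak{C}$.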

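For the particular comparable element $\mathfrak{C}$,

let $U = \{B \in \mathcal{T}_0 : B \subseteq \mathfrak{C} \text{ or } g(\mathfrak{C}) \subseteq B\}$,
partially ordered by inclusion.

Then, $\emptyset \in U$ as $\emptyset \in \mathcal{T}_0$ and $\emptyset \subseteq \mathfrak{C}$.

Also, if $A \in U$, then $g(A) \in U$ due to the following:
As $A \in \mathcal{T}_0$, $g(A) \in \mathcal{T}_0$.

Case 1: If $A \subsetneq \mathfrak{C}$, then $g(A) \subseteq \mathfrak{C}$, as proved above.
Therefore, $g(A) \in U$.
Case 2: If $A = \mathfrak{C}$, then $g(A) = g(\mathfrak{C})$ and so
$g(\mathfrak{C}) \subseteq g(A)$. Therefore, $g(A) \in U$.
Case 3: If $g(\mathfrak{C}) \subseteq A$, then as $A \subseteq g(A)$,
$g(\mathfrak{C}) \subseteq g(A)$. Therefore, $g(A) \in U$.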

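Consider a chain $\mathcal{C}$ in $U$.

For any $E \in \mathcal{C}$, $E \in U$ and so $E$ can be of two types:
Type 1: $E \subseteq \mathfrak{C}$.
Type 2: $g(\mathfrak{C}) \subseteq E$.

We note that the only element of $\mathcal{C}$ of both Type 1 and Type 2, if any, is
$\mathfrak{C}$, and that too, if and only if $\mathfrak{C}$ is maximal in $Z$, as
$g(\mathfrak{C}) \subseteq E \subseteq \mathfrak{C}$ implies that $E = \mathfrak{C} = g(\mathfrak{C})$.
Let $Y = \bigcup \{C : C \in \mathcal{C}\}$. As $\mathcal{C}$ is also a chain in the tower
$\mathcal{T}_0$, $Y \in \mathcal{T}_0$. There are two possibilities for $Y$:

Possibility 1: All elements of $\mathcal{C}$ are of Type 1. Then, all elements of $Y$ are in
$\mathfrak{C}$ and so $Y \subseteq \mathfrak{C}$.
Possibility 2: At least one of the $C$'s in $\mathcal{C}$, say $E$, is of Type 2. Then,
$g(\mathfrak{C}) \subseteq E$. As $E \subseteq Y$, therefore $g(\mathfrak{C}) \subseteq Y$.
As a result, $Y \in U$.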

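From the previous three paragraphs, $U$ is a tower included in $\mathcal{T}_0$.

But due to Lemma 7, $U = \mathcal{T}_0$. Therefore, for all $A \in \mathcal{T}_0$, $A \in U$ and so
$A \subseteq \mathfrak{C}$ or $g(\mathfrak{C}) \subseteq A$.
As $A \subseteq \mathfrak{C}$ implies that $A \subseteq g(\mathfrak{C})$, therefore for all
$A \in \mathcal{T}_0$, $A \subseteq g(\mathfrak{C})$ or $g(\mathfrak{C}) \subseteq A$.
As a result,

Lemma 8. If $\mathfrak{C}$ is comparable, then $g(\mathfrak{C})$ is also comparable.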

Now let $\mathcal{C}_0$ be the set of all comparable sets. Clearly, $\mathcal{C}_0$ is a chain in
$\mathcal{T}_0$, as comparable sets are comparable with each other.

1. $\emptyset \in \mathcal{C}_0$.

2. From Lemma 8, if $C \in \mathcal{C}_0$, then $g(C) \in \mathcal{C}_0$.

3. Consider a chain $\mathcal{C}$ in $\mathcal{C}_0$.

If $E \in \mathcal{C}$, then for a particular $A \in \mathcal{T}_0$, $E$ can be of two types:
Type 1: $A \subseteq E$.
Type 2: $E \subseteq A$.

Note that the only element of $\mathcal{C}$ of both Type 1 and Type 2, if any, is $A$ itself.
Let $Y = \bigcup \{C : C \in \mathcal{C}\}$. There are two possibilities for $Y$:

Possibility 1: All elements of $\mathcal{C}$ are of Type 2. Then, all elements of $Y$ are in $A$,
i.e., $Y \subseteq A$.
Possibility 2: At least one of the $C$'s in $\mathcal{C}$, say $E$, is of Type 1. Then,
$A \subseteq E$. As $E \subseteq Y$, therefore $A \subseteq Y$.

Therefore, $Y$ is comparable with $A$. As $A$ was an arbitrary element of $\mathcal{T}_0$, $Y$ is a
comparable set, and hence, $Y \in \mathcal{C}_0$.

From 1), 2) and 3) in the previous paragraph, we infer that $\mathcal{C}_0$ is a tower. But by
Lemma 7, $\mathcal{C}_0 = \mathcal{T}_0$.
Hence, the important result:

Lemma 9. $\mathcal{T}_0$ is a chain in $\mathcal{T}_0$.

Now, let $A = \bigcup \{C : C \in \mathcal{T}_0\}$.
Due to Lemma 9 and property 3) of towers, $A \in \mathcal{T}_0$.
By property 2) of towers, as $A \in \mathcal{T}_0$, therefore $g(A) \in \mathcal{T}_0$.
Also, as $A$ is the union of all sets in $\mathcal{T}_0$, $g(A) \subseteq A$.
But as $A \subseteq g(A)$ always, therefore $A = g(A)$, and so, by Lemma 6, $A$ is maximal in $Z$.

Thus, we have obtained the desired result. 
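
In a finite example the smallest tower can be computed directly, since it is then just the sequence
$\emptyset, g(\emptyset), g(g(\emptyset)), \ldots$ up to the first fixed point: that sequence is
contained in every tower and is itself a tower. The sketch below illustrates the final step of the
argument under stated assumptions: $X = \{1, 2, 3, 4, 6, 9\}$ ordered by divisibility is a
hypothetical poset, $Z$ is the set of its chains, and `min` stands in for the choice function $f$.
It is not a substitute for the general proof, where the tower need not be reachable by finitely
many applications of $g$.

```python
# Hypothetical finite poset: X ordered by divisibility.
X = [1, 2, 3, 4, 6, 9]


def comparable(a, b):
    """Two elements are comparable when one divides the other."""
    return a % b == 0 or b % a == 0


def is_chain(A):
    """Membership test for Z, taken here to be the set of chains in X."""
    return all(comparable(a, b) for a in A for b in A)


def hat(A):
    """A-hat: the elements of X that can be adjoined to A within Z."""
    return frozenset(x for x in X if is_chain(A | {x}))


def g(A):
    """Adjoin min(hat(A) - A) when possible; min is the assumed choice rule."""
    extra = hat(A) - A
    return A | {min(extra)} if extra else A


# Iterate g from the empty set until it stops growing.
tower = [frozenset()]
while g(tower[-1]) != tower[-1]:
    tower.append(g(tower[-1]))

# Lemma 9 in this example: the members of the tower are pairwise comparable.
tower_is_chain = all(a <= b or b <= a for a in tower for b in tower)

# The union of the tower is its last member, and g fixes it.
top = tower[-1]
union = frozenset().union(*tower)

# In a chain under divisibility the numerically largest element is the greatest.
n = max(top)
n_is_maximal_in_X = all(y == n or y % n != 0 for y in X)
```

With these choices the tower is $\emptyset, \{1\}, \{1, 2\}, \{1, 2, 4\}$; its union is the maximal
chain $\{1, 2, 4\}$, whose greatest element 4 is maximal in $X$, as Lemma 5 predicts.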






